Genetic Epidemiology
Wiley
All preprints, ranked by how well they match Genetic Epidemiology's content profile, based on 14 papers previously published here. The average preprint has a 0.03% match score for this journal, so anything above that is already an above-average fit. Older preprints may already have been published elsewhere.
Hatton, A.; Brito Nunes, C.; Lawlor, D.; Evans, D.
Offspring can exert profound effects on the health of their parents. This is perhaps most apparent during the perinatal period, where the fetus influences processes that alter pre- and post-natal maternal physiology. In theory, it is possible to investigate the causal effect of offspring traits on parental health outcomes using Mendelian randomisation (MR). However, as parental and offspring genotypes are correlated, analyses need to be adjusted for the parent's genotype to avoid confounding through the parental genome. Such analyses are difficult to perform at scale because of the paucity of cohorts across the world with large numbers of genotyped maternal- or paternal-offspring dyads and parent-offspring trios. In this manuscript, we explain how the causal effects of offspring traits on parental health outcomes can be investigated using MR and discuss the challenges in implementing such designs. We introduce the "offspring genotype by proxy" MR framework, which can be employed in the absence of offspring genetic information to complement existing approaches in the triangulation of causal inference. The basic idea is to use parental genotypes to proxy the direct effect of their offspring's genotype on their offspring's own exposures. Specifically, we show how it is possible to proxy offspring genotype with paternal genotype when investigating causal effects of offspring traits on maternal health outcomes (and vice versa for paternal outcomes), which minimises the problem of confounding from the relevant parent's genotype. We compare our framework to other MR designs that might be used to explore effects of offspring traits on parental health and investigate the consequences of model misspecification and spousal misclassification on statistical power and consistency.
Given the increasing availability of datasets like the UK Biobank that (incidentally) include tens of thousands of genome-wide genotyped spousal pairs, as well as large population-based biobanks with linked health record data for first-degree relatives, we conclude that the offspring genotype by proxy MR approach could augment causal analyses of offspring exposures on their parents' outcomes, as implementation is not restricted to datasets with parent-offspring genotype information.
Spiller, W.; Hartwig, F. P.; Sanderson, E.; Davey Smith, G.; Bowden, J.
Studies leveraging gene-environment (GxE) interactions within Mendelian randomization (MR) analyses have prompted the emergence of two methodologies: MR-GxE and MR-GENIUS. Such methods are attractive in allowing for pleiotropic bias to be corrected when using individual instruments. Specifically, MR-GxE requires an interaction to be explicitly identified, while MR-GENIUS does not. We critically examine the assumptions of MR-GxE and MR-GENIUS, and propose sensitivity analyses to evaluate their performance. Finally, we explore the association between body mass index (BMI) and systolic blood pressure (SBP) using data from the UK Biobank. We find both approaches share similar assumptions, though differences between the approaches lend themselves to differing research settings. Where interactions are identified, MR-GxE relies on weaker assumptions and allows for further sensitivity analyses. MR-GENIUS circumvents the need to identify interactions, but relies on the MR-GxE assumptions holding globally. Through applied analyses we find evidence of a positive effect of BMI upon SBP.
Dudbridge, F.; Voller, B.; Woodward, R. M.; Frayling, T.; Pilling, L. C.; Bowden, J.
Mendelian Randomisation Egger regression (MR-Egger) is a popular method for causal inference using single-nucleotide polymorphisms (SNPs) as instrumental variables. It allows all SNPs to have direct pleiotropic effects on the outcome, provided that those effects are independent of the effects on the exposure, known as the InSIDE assumption. However, the results of MR-Egger, and the InSIDE assumption itself, are sensitive to which allele is coded as the effect allele for each SNP. A pragmatic convention is to code the alleles with positive effects on the exposure, which has some advantages in interpretation but some statistical limitations. Here we show that if the InSIDE assumption holds under all-positive coding of the exposure effects, it cannot hold under all-positive coding of the pleiotropic effects, and argue that this undermines the soundness of MR-Egger. We propose a modification that has the Genotype Recoding Invariance Property (GRIP), achieving the main aim of MR-Egger without the difficulties of allele coding. Our approach, MR-GRIP, is valid under a "Variance independent of covariance explained" assumption (VICE), which amounts to an inverse relationship between exposure effects and pleiotropic effects. Examples and simulations suggest that MR-GRIP can reconcile differences between MR-Egger and alternative methods. Author summary: Mendelian Randomisation (MR) is a statistical method that can distinguish causal relationships from statistical correlations, under certain assumptions. The principle is to use genetic markers, such as single-nucleotide polymorphisms (SNPs), as proxies for the causal variable. One version of MR, called MR-Egger, is very popular but has a serious drawback in that its results depend on how the SNPs are numerically encoded. We propose a modification that has the Genotype Recoding Invariance Property (GRIP), which avoids this problem whilst achieving the main aim of MR-Egger.
We illustrate our approach, called MR-GRIP, in simulations and in real data examples including the effect of serum urate on coronary heart disease (CHD), the effect of body mass index on coronary artery disease, and the joint effects of plasma lipids on CHD. In each case, MR-GRIP gives plausible results, and in some cases, it appears to reconcile differences between MR-Egger and alternative methods for MR.
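The allele-coding sensitivity described in this abstract can be illustrated with a small sketch. It is not the authors' MR-GRIP method; it only shows, under assumed inputs, that a weighted regression with an intercept (the usual MR-Egger fit) changes when effect alleles are swapped, while an inverse-variance weighted slope through the origin does not. The summary statistics and the helper names `ivw_slope` and `egger_fit` are hypothetical.

```python
import numpy as np

# Hypothetical summary statistics for five SNPs (illustrative values only).
bx = np.array([0.10, -0.05, 0.08, -0.12, 0.03])     # SNP-exposure associations
by = np.array([0.06, -0.01, 0.05, -0.04, 0.03])     # SNP-outcome associations
se = np.array([0.010, 0.012, 0.009, 0.011, 0.010])  # standard errors of by


def ivw_slope(bx, by, se):
    """Inverse-variance weighted slope of by on bx, forced through the origin."""
    w = 1.0 / se**2
    return np.sum(w * bx * by) / np.sum(w * bx**2)


def egger_fit(bx, by, se):
    """Weighted least squares of by on bx with an intercept.

    Returns an array [intercept, slope].
    """
    w = 1.0 / se**2
    X = np.column_stack([np.ones_like(bx), bx])
    return np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * by))


# Swapping the effect allele of a SNP flips the sign of both associations.
# Here every SNP is recoded so that its exposure association is positive.
sign = np.where(bx < 0, -1.0, 1.0)
bx_pos, by_pos = bx * sign, by * sign

ivw_orig, ivw_pos = ivw_slope(bx, by, se), ivw_slope(bx_pos, by_pos, se)
egger_orig, egger_pos = egger_fit(bx, by, se), egger_fit(bx_pos, by_pos, se)

print("IVW slope   (original, all-positive):", ivw_orig, ivw_pos)
print("Egger slope (original, all-positive):", egger_orig[1], egger_pos[1])
```

The through-origin slope depends only on the products bx*by and bx**2, which are unchanged by a sign flip, whereas the intercept term in the Egger fit depends on the signed values themselves.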
Leyden, G. M.; Pagoni, P.; Power, G. M.; Carslake, D.; Richardson, T. G.; Tilling, K.; Hemani, G.; Davey Smith, G.; Sanderson, E.
Genome-wide association studies (GWAS) are conventionally conducted in cohorts spanning a wide age-range. These studies typically assume that genetic associations are constant across different ages. Some traits, however, may have age-varying genetic associations. This has implications for the interpretation of genetic effects derived in downstream applications, such as Mendelian randomization (MR) analyses. In this study we conducted a series of age-stratified GWAS on individuals aged 40-69 years in the UK Biobank, for body-mass index (BMI) and three blood pressure traits (systolic, diastolic and pulsatile pressure (PP)) in 2-year age strata (N up to 26,330). We used a meta-regression approach to systematically identify single nucleotide polymorphisms (SNPs) with evidence for age interaction effects among trait-associated GWAS signals and additional loci genome-wide. Within an MR framework, we examine the relationship between BMI and blood pressure traits on cardiovascular and cardiometabolic outcomes (type-2 diabetes (T2D), stroke, peripheral artery disease (PAD), heart failure, coronary heart disease and atrial fibrillation). Next, we describe the effect of the SNP*Age interaction on those relationships in a modified inverse-variance weighted (IVW) analysis. We identified differential enrichment of age-interaction effects, which was trait dependent. For example, 10.3% of BMI discovery SNPs had evidence for an age-interaction in our data compared to 44.7% for PP (at P < 0.05). Our downstream MR and modified IVW analyses highlight the influence of age on the genetically predicted relationship between PP and adverse cardiovascular outcomes. For example, our results indicated that an increased rate of change in genetically predicted PP across the age period is associated with higher susceptibility to PAD (interaction odds ratio = 2.71; P = 1.82 × 10^-13; 95% CI: 2.08-3.53).
The data generated in this project provide a valuable resource for further exploration of mechanisms relevant to the genetic architecture of complex traits, and all summary data will be made readily accessible to the research community. Author Summary: Genetic variants which reliably predict variation in a trait are a valuable tool within genetic epidemiology studies, offering a means to estimate whether an exposure-outcome relationship is likely to be causal using a method called Mendelian randomization (MR). Typically, MR results are interpreted as the cumulative lifetime effect of the exposure on the outcome. However, there is growing evidence which suggests that the influence of genetic effects on trait variation detected in cross-sectional population studies may be age dependent in some scenarios. In this work we aimed to conduct a thorough investigation of whether and to what extent the influence of genetics on population-level trait variation changes across adulthood. We investigated this question within a methodological framework which used age-stratified summary level data, demonstrating that this approach may have wide applicability to the research community where individual level cohort data are not publicly available. We demonstrate that age interacts with genetic influences across adulthood in a trait dependent manner, where genetics may have a stronger influence on variation in body-mass index measured earlier in life, and on pulsatile pressure later in life. We take advantage of the MR and IVW frameworks to further illustrate how the variation in the exposure explained by genetics varies with increasing age. This exploratory work helps provide insight into the extent to which distinct genetic effects are detectable across adulthood, helping us to understand how more precise lifecourse effects may be genetically proxied within an MR setting.
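A meta-regression of stratum-specific SNP effects on age, as mentioned in this abstract, might look roughly like the sketch below. It assumes a fixed-effect, inverse-variance weighted linear regression of each stratum's effect estimate on the stratum midpoint age; the abstract does not specify the exact model, so this is an illustration only. The function name `age_meta_regression` and all numbers are hypothetical.

```python
import numpy as np


def age_meta_regression(ages, betas, ses):
    """Fixed-effect meta-regression of per-stratum SNP effects on age.

    ages  : stratum midpoint ages
    betas : SNP effect estimate in each age stratum
    ses   : standard error of each estimate
    Returns (slope per year of age, standard error of the slope).
    """
    ages, betas, ses = (np.asarray(a, dtype=float) for a in (ages, betas, ses))
    # Centre age so the intercept is the effect at the mean stratum age.
    X = np.column_stack([np.ones_like(ages), ages - ages.mean()])
    # Weighted least squares via row scaling by 1/se.
    sw = 1.0 / ses
    coef, *_ = np.linalg.lstsq(X * sw[:, None], betas * sw, rcond=None)
    # Fixed-effect covariance of the coefficients: (X' W X)^-1 with W = 1/se^2.
    cov = np.linalg.inv(X.T @ (X / ses[:, None] ** 2))
    return coef[1], np.sqrt(cov[1, 1])


# Hypothetical example: 2-year strata spanning ages 40-69, with a SNP effect
# that shrinks by 0.001 per year of age (made-up values).
midpoints = np.arange(41, 70, 2)
betas = 0.05 - 0.001 * (midpoints - 55)
ses = np.full(midpoints.shape, 0.004)

slope, slope_se = age_meta_regression(midpoints, betas, ses)
print(f"age slope = {slope:.4f} per year (SE {slope_se:.4f})")
```

A slope that is large relative to its standard error would be read as evidence for a SNP-by-age interaction under these assumptions.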
Mukhopadhyay, N.; Feingold, E. E.; Brand, H.; Lee, M. K.; Kurtas, E. N.; Sanchis-Juan, A.; Moreno-Uribe, L.; Wehby, G.; Valencia-Ramirez, L. C.; Restrepo Muneton, C. P.; Padilla, C.; Deleyiannis, F.; Poletta, F. A.; Orioli, I. M.; Hecht, J. T.; Buxo, C. J.; Butali, A.; Adeyemo, W. L.; Abebe, M. E.; Vieira, A. R.; Shaffer, J. R.; Murray, J. C.; Weinberg, S. M.; Ruczinski, I.; Leslie-Clarkson, E. J.; Marazita, M. L.
Objective: Our understanding of the genetic causes of non-syndromic orofacial clefts (OFCs) is based largely upon genetic studies of common and rare nucleotide variants. Less is known about the role of copy number variations (CNVs) and the studies published to date have been limited to either small samples or targeted genomic regions. The objective of our study is to investigate the contribution of CNVs spread across the entire genome to OFC risk in a large multi-ancestry cohort. Methods: We utilized PennCNV on microarray genotyping data to detect CNVs in 10,240 participants (2,484 with clefts, 7,756 unaffected). 70,695 quality-filtered autosomal CNVs (49,660 deletions, 21,035 duplications) were used to assign normal/abnormal copy number statuses at 67,199 positions from the GRCh37 genome assembly. Genome-wide association was run between cleft status and copy number status. Results: We observed a highly significant association between OFCs and deletions on chromosome 7p14.1 (p=1.32e-35) driven by Central and South American ancestry (p=1.04e-25) participants, with less significant contributions from European (p=3.37e-08) and Asian (p=0.01) ancestry participants. We also observed four other loci with p-values below 10e-04. Conclusion: The 7p14.1 association observed in our study is a replication of two prior studies in independent cohorts of European ancestry. However, this locus lies in a T-cell receptor region that is subject to somatic rearrangements that decrease in frequency with age and may affect genetic association results. Our data show age effects as well as differences between blood and saliva samples. Thus, our results can be interpreted either as supporting a previously established association with orofacial clefts, or as questioning those previous results in favor of a hypothesis about the behavior of somatic rearrangements in T-cell receptor regions.
Xu, Z. M.; Burgess, S.
Mendelian randomization is the use of genetic variants to assess the effect of intervening on a risk factor using observational data. We consider the scenario in which there is a pharmacomimetic (that is, treatment-mimicking) genetic variant that can be used as a proxy for a particular pharmacological treatment that changes the level of the risk factor. If the association of the pharmacomimetic genetic variant with the risk factor is stronger in one subgroup of the population, then we may expect the effect of the treatment to be stronger in that subgroup. We test for gene-gene interactions in the associations of variants with a modifiable risk factor, where one genetic variant is treated as pharmacomimetic and the other as an effect modifier, to find genetic sub-groups of the population with different predicted response to treatment. If individual genetic variants that are strong effect modifiers cannot be found, moderating variants can be combined using a random forest of interaction trees method into a polygenic response score, analogous to a polygenic risk score for risk prediction. We illustrate the application of the method to investigate effect heterogeneity in the effect of statins on low-density lipoprotein cholesterol.
Crick, D.; Medland, S.; Davey Smith, G.; Evans, D.
Hand preference first appears in utero, yet twin studies and GWAS show that the majority of variance in hand preference is explained by environmental factors. Using UK Biobank data and multivariable logistic regression to test associations between potential causes of handedness and offspring hand preference, we found maternal smoking during pregnancy increased the probability of being right-handed after adjustment for covariates. Using a proxy gene-by-environment (GxE) Mendelian randomization design we investigated the potential causal effect of maternal smoking during pregnancy on offspring handedness. We used rs16969968 in the CHRNA5 gene and a polygenic risk score of genome-wide significant smoking-heaviness variants to proxy smoking behaviour. We stratified based on reported maternal smoking during pregnancy because, regardless of genotype, any causal effect of maternal smoking on offspring handedness should only manifest in individuals whose mothers smoked during pregnancy. The GxE MR analyses found no causal effect of maternal smoking during pregnancy on offspring hand-preference. Our study contributes to the understanding of hand preference and its potential early-life determinants. However, the main factors contributing to variation in hand preference remain unresolved.
Gkatzionis, A.; Davey Smith, G.; Tilling, K.
Mendelian randomization is currently mainly implemented through the use of genetic variants as instrumental variables to investigate the causal effect of an exposure on an outcome of interest. Mendelian randomization studies are robust to confounding bias and reverse causation, but they remain susceptible to selection bias; for example, this can happen if the exposure or outcome are associated with selection into the study sample. Negative controls are sometimes used to detect biases (typically due to confounding) in observational studies. Here, we focus specifically on Mendelian randomization analyses and discuss under what conditions a variable can be used as a negative control outcome to detect selection mechanisms that could bias Mendelian randomization estimates. We show that the main requirement is that the negative control outcome relates to confounders of the exposure and outcome. Counter-intuitively, the effect of the negative control on selection is of secondary concern; for example, a variable that does not affect selection can be a valid negative control for an outcome that does. We also investigate under what conditions age and sex can be used as negative control outcomes in Mendelian randomization analyses. In a real-data application, we investigate the pairwise causal relationships between 19 traits, utilizing data from the UK Biobank. Treating biological sex as a negative control outcome, we identify selection bias in analyses involving commonly used traits such as alcohol consumption, body mass index and educational attainment.
Woolf, B.; Cronje, H. T.; Zagkos, L.; Larsson, S. C.; Gill, D.; Burgess, S.
Drug-target Mendelian randomization (MR) is a popular approach for exploring the effects of pharmacological targets. Cis-MR designs select variants within the gene region that code for a protein of interest to mimic pharmacological perturbation. An alternative uses variants associated with behavioral proxies of target perturbation, such as drug usage. Both have been employed to investigate the effects of caffeine but have drawn different conclusions. We use the effects of caffeine on body mass index (BMI) as a case study to highlight two potential flaws of the latter strategy in drug-target MR: misidentifying the exposure and using invalid instruments. Some variants associate with caffeine consumption because of their role in caffeine metabolism. Since people with these variants require less caffeine for the same physiological effect, the direction of the caffeine-BMI association is flipped depending on whether estimates are scaled by caffeine consumption or plasma caffeine levels. Other variants seem to associate with caffeine consumption via behavioral pathways. Using multivariable-MR, we demonstrate that caffeine consumption behavior influences BMI independently of plasma caffeine. This implies the existence of behaviorally mediated exclusion restriction violations. Our results support the superiority of cis-MR study designs in pharmacoepidemiology over the use of behavioral proxies of drug targets.
Parker, R. M. A.; Leckie, G.; Goldstein, H.; Howe, L. D.; Heron, J.; Hughes, A. D.; Phillippo, D. M.; Tilling, K.
Within-individual variability of repeatedly-measured exposures may predict later outcomes: e.g. blood pressure (BP) variability (BPV) is an independent cardiovascular risk factor above and beyond mean BP. Since two-stage methods, known to introduce bias, are typically used to investigate such associations, we introduce a joint modelling approach, examining associations of both mean BP and BPV across childhood to left ventricular mass (indexed to height; LVMI) in early adulthood with data from the UK's Avon Longitudinal Study of Parents and Children (ALSPAC) cohort. Using multilevel models, we allow BPV to vary between individuals (a "random effect") as well as to depend on covariates (allowing for heteroscedasticity). We further distinguish within-clinic variability ("measurement error") from visit-to-visit BPV. BPV was predicted to be greater at older ages, at higher bodyweights, and in females, and was positively correlated with mean BP. BPV had a positive association with LVMI (a 10% increase in SD(BP) was predicted to increase LVMI by mean = 0.42%; 95% credible interval: -0.47%, 1.38%), but this association became negative (mean = -1.56%; 95% credible interval: -5.01%, 0.44%) once the effect of mean BP on LVMI was adjusted for. This joint modelling approach offers a flexible method of relating repeatedly-measured exposures to later outcomes.
SHI, Y.; Xiang, Y.; YE, Y.; HE, T.; SHAM, P.-C.; So, H.-C.
Mendelian Randomization (MR), a method that employs genetic variants as instruments for causal inference, has gained popularity in assessing the causal effects of risk factors. However, almost all MR studies primarily concentrate on the population's average causal effects. With the advent of precision medicine, the individualized treatment effect (ITE) is often of greater interest. For instance, certain risk factors may pose a higher risk to some individuals compared to others, and the benefits of a treatment may vary among individuals. This highlights the importance of considering individual differences in risk and treatment response. We propose a new framework that expands the concept of MR to investigate individualized causal effects. We presented several approaches for estimating Individualized Treatment Effects (ITEs) within this MR framework, primarily grounded on the principles of the "R-learner". To evaluate the existence of causal effect heterogeneity, we proposed two permutation testing methods. We employed Polygenic Risk Scores (PRS) as the instrument and demonstrated that the removal of potentially pleiotropic SNPs could enhance the accuracy of ITE estimates. The validity of our approach was substantiated through comprehensive simulations. We applied our framework to study the individualized causal effect of various lipid traits, including Low-Density Lipoprotein Cholesterol (LDL-C), High-Density Lipoprotein Cholesterol (HDL-C), Triglycerides (TG), and Total Cholesterol (TC), on the risk of Coronary Artery Disease (CAD) using data from the UK Biobank. Our findings indicate that an elevated level of LDL-C is causally linked to increased CAD risks, with the effect demonstrating significant heterogeneity. Similar results were observed for TC.
Furthermore, we identified clinical factors contributing to the heterogeneity of ITEs through Shapley value analysis. This underscores the importance of individualized treatment plans in managing CAD risks.
Prijatelj, V.; Grgic-Chavez, O.; van der Tas, J.; Andaur Navarro, C. L.; Uitterlinden, A. G.; Rivadeneira, F.; Wolvius, E. B.; Medina-Gomez, C.
Objective: The panoramic mandibular index (PMI) and mental index (MI) assessed on dental panoramic radiographs (DPRs) have been postulated as useful for the assessment of adult bone health. However, their utility in children remains to be determined. Our objective was to establish genetic determinants of the PMI/MI and to evaluate the relationship between these indices and total body less-head bone mineral density (TBLH-BMD) by leveraging data from medical records and genetic profiles of Dutch children. Study design: This study was embedded in the Generation R Study including 3,518 participants at a mean age of 13 years. BMD was obtained from dual-energy X-ray (DXA) scans, while radiomorphometric measurements of the mandibular bone were obtained from DPRs. Genome-wide association studies (GWAS) on PMI/MI were performed using individual genotyped data imputed to the 1000 Genomes reference panel. The association between PMI/MI and BMD was comprehensively assessed following a combined observational and genetic analysis, both corrected for biological covariates such as sex, age, and others, using a BMD polygenic risk score (PGS) in pubescent children. Results: The PMI and MI GWAS identified an associated signal (p = 2.53 × 10^-9) mapping to the ODF3/BET1L/RIC8A/SIRT3 locus, previously associated with BMD. Moreover, significant differences in PMI and MI were observed across the extremes of the TBLH-BMD PGS distribution. Our results also show that a standard deviation (SD) increase in measured TBLH-BMD was associated with a 0.244 SD increase in PMI (95% CI 0.211 - 0.277, p<0.001) and a 0.426 SD increase in MI (95% CI 0.395 - 0.457, p<0.001). Conclusion: Altogether, our results suggest that PMI/MI and BMD partially share common biological pathways, and the former may constitute a relevant marker when screening for children with impaired bone health using DPRs.
Sum, K. K.; Hughes, A. M.; Havdahl, A.; Davey Smith, G.; Howe, L. D.
Mendelian randomization (MR) uses genetic variants as instrumental variables to enhance causal inference. Studies have identified genetic variants related to childhood maltreatment, but interpreting the effects of these variants or assessing the plausibility of MR assumptions is complex. We aim to investigate the feasibility of applying MR to complex social traits using the association between childhood maltreatment and mental health and behavioral outcomes as an example. We explore four potential key concerns: confounding by population phenomena, horizontal and vertical pleiotropy, reverse causality, and selection. For each concern, we demonstrate scenarios where MR studies of childhood maltreatment may be biased using DAGs and critical appraisal of previous MR analyses. For confounding by population phenomena, we further perform within-family genetic analyses in 42,101 parent-offspring trios from the Norwegian Mother, Father and Child Cohort Study (MoBa) to address bias due to family-level processes, since childhood maltreatment often occurs within households. Our results showed same-trait shrinkage (11% attenuation of the association between children's polygenic risk scores of childhood maltreatment (PRSCM) and mothers' report of children's physical abuse) but not cross-trait shrinkage (children's PRSCM and children's mental health and behavioral outcomes) after adjusting for parental PRSCM. The lack of cross-trait shrinkage suggests that genetic variants related to childhood maltreatment may be capturing other child-level phenotypes, after adjusting for family-level processes. Mothers' PRSCM were also associated with mothers' own maltreatment experiences in childhood and adulthood with similar magnitudes, suggesting these genetic effects are not specific to childhood maltreatment. Due to the complexity involved in the causal chain of childhood maltreatment and it being reported, the interpretation of MR studies for childhood maltreatment is challenging.
Other causal approaches should be considered for observational studies of complex social traits.
Chadha, M.; Bell, J.; Sanderson, E.
Background: Numerous observational studies have shown an association between higher circulating 25-hydroxyvitamin D (vitamin D) and lower body mass index (BMI). Whether this represents a causal effect remains unclear. Mendelian randomization (MR) is an approach to causal inference that uses genetic variants as instrumental variables to estimate the effect of exposures on outcomes of interest. MR estimates are not biased by confounding, reverse causation and other biases in the same way as conventional observational estimates. In this study, we used MR with new data on genetic variants associated with vitamin D to estimate the effect of vitamin D on BMI. Methods: We selected single nucleotide polymorphisms (SNPs) which were associated with vitamin D in a recent large genome-wide association study (GWAS) at genome-wide significance as instruments for vitamin D. We used inverse variance weighted models and further assessed individual SNPs that showed evidence of an effect, and biologically informed SNPs located in genetic regions previously associated with vitamin D, for associations with other traits at genome-wide significance, using Wald ratio estimation. Results: Our main results showed no evidence of an effect of vitamin D on BMI (estimated standard deviation change in BMI per standard deviation change in vitamin D: -0.003, 95% confidence interval [-0.06, 0.06]). This was also supported by pleiotropy-robust sensitivity analyses. Individual SNPs that showed evidence of an effect of vitamin D on either lower or higher BMI were strongly associated with numerous other traits, suggesting high levels of horizontal pleiotropy. Biologically informed SNPs showed no evidence of a causal effect of vitamin D on BMI and showed substantially less evidence of pleiotropic effects. Conclusion: The observed association between vitamin D and BMI is unlikely to be due to a causal effect of vitamin D on BMI.
We also show how additional evidence can be incorporated into an MR study to interrogate individual SNPs for potential pleiotropy and improve interpretation of results.
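The two estimators named in this abstract, the per-SNP Wald ratio and the inverse variance weighted (IVW) model, are closely related, and a minimal sketch may make that concrete. It assumes the common first-order approximation in which the uncertainty in the SNP-exposure association is ignored. The function names and all summary statistics are hypothetical and are not taken from the study.

```python
import numpy as np


def wald_ratio(bx, by, se_by):
    """Per-SNP Wald ratio and its first-order standard error.

    Assumes the SNP-exposure association bx is known without error.
    """
    est = by / bx
    se = se_by / np.abs(bx)
    return est, se


def ivw_from_ratios(est, se):
    """Fixed-effect inverse-variance weighted mean of the Wald ratios."""
    w = 1.0 / se**2
    return np.sum(w * est) / np.sum(w), np.sqrt(1.0 / np.sum(w))


# Hypothetical summary statistics for four SNPs (made-up values).
bx = np.array([0.08, 0.05, 0.11, 0.04])        # SNP-exposure associations
by = np.array([0.001, -0.004, 0.002, -0.001])  # SNP-outcome associations
se_by = np.array([0.005, 0.006, 0.004, 0.007])

ratios, ratio_ses = wald_ratio(bx, by, se_by)
ivw_est, ivw_se = ivw_from_ratios(ratios, ratio_ses)

# The same estimate written as a weighted regression through the origin.
ivw_regression = np.sum(bx * by / se_by**2) / np.sum(bx**2 / se_by**2)

print("Wald ratios:", ratios)
print(f"IVW estimate = {ivw_est:.4f} (SE {ivw_se:.4f})")
```

Inspecting the individual ratios alongside the pooled estimate is one simple way to spot SNPs whose estimates diverge from the rest, which is the kind of per-SNP scrutiny the abstract describes.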
Battram, T.; Gaunt, T. R.; Relton, C. L.; Timpson, N. J.; Hemani, G.
Identifying the genes, properties of these genes and pathways to understand the underlying biology of complex traits responsible for differential health states in the population is a common goal of epigenome-wide and genome-wide association studies (EWAS and GWAS). GWAS identify genetic variants that affect the trait of interest or variants that are in linkage disequilibrium with the true causal variants. EWAS identify variation in DNA methylation, a complex molecular phenotype, associated with the trait of interest. Therefore, while GWAS in principle will only detect variants within or near causal genes, EWAS can also detect genes that confound the association between a trait and a DNA methylation site, or are reverse causal. Here we systematically compare association EWAS and GWAS results of 14 complex traits (N > 4500). A small fraction of detected genomic regions were shared by both EWAS and GWAS (0-9%). We evaluated if the genes or gene ontology terms flagged by GWAS and EWAS overlapped, and after a multiple testing correction, found substantial overlap for diastolic blood pressure (gene overlap P = 5.2 × 10^-6, term overlap P = 0.001). We superimposed our empirical findings against simulated models of varying genetic and epigenetic architectures and observed that in a majority of cases EWAS and GWAS are likely capturing distinct genesets, implying that genes identified by EWAS are not generally causally upstream of the trait. Overall, our results indicate that EWAS and GWAS are capturing different aspects of the biology of complex traits.
Ngwa, J. S.; Yanek, L. R.; Kammers, K.; Kanchan, K.; Taub, M. A.; Scharpf, R. B.; Faraday, N.; Becker, L. C.; Mathias, R. A.; Ruczinski, I.
Genome-wide association studies (GWAS) have successfully identified thousands of single nucleotide polymorphisms (SNPs) associated with complex traits; however, the identified SNPs account for only a fraction of trait heritability, and identifying the functional elements through which genetic variants exert their effects remains a challenge. Recent evidence suggests that SNPs associated with complex traits are more likely to be expression quantitative trait loci (eQTL). Thus, incorporating eQTL information can potentially improve power to detect causal variants missed by traditional GWAS approaches. Using genomic, transcriptomic, and platelet phenotype data from the Genetic Study of Atherosclerosis Risk family-based study, we investigated the potential to detect novel genomic risk loci by incorporating information from eQTL in the relevant target tissues (i.e. platelets and megakaryocytes). Permutation analyses were performed to obtain family-wise error rates for eQTL associations, substantially lowering the genome-wide significance threshold for SNP-phenotype associations. In addition to confirming the well-known association between PEAR1 and platelet aggregation, our eQTL-focused approach identified a novel locus (rs1354034) and gene (ARHGEF3) not previously identified in a GWAS of platelet aggregation phenotypes. A colocalization analysis showed strong evidence for a functional role of this eQTL.
Adjangba, C.; Border, R.; Romero Villela, P. N.; Ehringer, M. A.; Evans, L. M.
Tobacco smoking is the leading cause of preventable death globally. Smoking quantity, measured in cigarettes per day (CPD), is influenced both by the age of onset of regular smoking (AOS) and by genetic factors, including a strong effect of the non-synonymous single nucleotide polymorphism rs16969968. A previous study by Hartz et al. reported an interaction between these two factors, whereby rs16969968 risk allele carriers who started smoking earlier showed increased risk for heavy smoking compared to those who started later. This finding has yet to be replicated in a large, independent sample. We performed a preregistered, direct replication attempt of the rs16969968xAOS interaction on smoking quantity in 128,383 unrelated individuals from the UK Biobank, meta-analyzed across ancestry groups. We fit statistical association models mirroring the original publication as well as formal interaction tests on multiple phenotypic and analytical scales. We replicated the main effects of rs16969968 and AOS on CPD but failed to replicate the interaction using previous methods. Nominal significance of the rs16969968xAOS interaction term depended strongly on the scale of analysis and the particular phenotype, as did associations stratified by early/late AOS. No interaction tests passed genome-wide correction (α = 5e-8), and all estimated interaction effect sizes were much smaller in magnitude than previous estimates. We failed to replicate the strong rs16969968xAOS interaction effect previously reported. If such gene-moderator interactions influence complex traits, they likely depend on scale of measurement, and current biobanks lack the power to detect significant genome-wide associations given the minute effect sizes expected. IMPLICATIONS: We failed to replicate the strong rs16969968xAOS interaction effect on smoking quantity previously reported.
If such gene-moderator interactions influence complex traits, current biobanks lack the power to detect significant genome-wide associations given the minute effect sizes expected. Furthermore, many potential interaction effects are likely to depend on the scale of measurement employed.
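The point about scale dependence can be made concrete with a small numerical sketch. The model below is hypothetical and is not taken from the study: `expected_cpd` and its coefficients are invented for illustration, and the model deliberately contains no interaction term on the log scale. The same expected values still show a non-zero interaction contrast on the raw cigarettes-per-day scale.

```python
import math


def expected_cpd(risk_alleles, early_onset, b0=2.0, b_snp=0.10, b_aos=0.30):
    """Hypothetical log-linear model of cigarettes per day with NO interaction term."""
    return math.exp(b0 + b_snp * risk_alleles + b_aos * early_onset)


def interaction_contrast(f):
    """Difference in differences: SNP effect under early onset minus SNP effect under late onset."""
    return (f(1, 1) - f(0, 1)) - (f(1, 0) - f(0, 0))


# Raw (additive) scale: the contrast is positive even though the model has no interaction.
contrast_raw_scale = interaction_contrast(expected_cpd)

# Log scale: the contrast vanishes (up to floating-point error).
contrast_log_scale = interaction_contrast(lambda g, e: math.log(expected_cpd(g, e)))

print(f"raw-scale contrast: {contrast_raw_scale:.4f}")
print(f"log-scale contrast: {contrast_log_scale:.2e}")
```

On the raw scale the contrast equals exp(b0) · (exp(b_snp) − 1) · (exp(b_aos) − 1), which is non-zero whenever both main effects are non-zero, so an apparent interaction can arise from the choice of scale alone.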
Sun, B.; Liu, Z.; Tchetgen Tchetgen, E.
Mendelian randomization (MR) is a popular instrumental variable (IV) approach in which genetic markers are used as IVs. To improve efficiency, multiple markers are routinely used in MR analyses, raising concerns about bias due to possible violation of the IV exclusion restriction, which requires that no IV has a direct effect on the outcome other than through the exposure in view. To address this concern, we introduce a new class of Multiply Robust MR (MR2) estimators that are guaranteed to remain consistent for the causal effect of interest provided that at least one genetic marker is a valid IV, without necessarily knowing which IVs are invalid. We show that the proposed MR2 estimators are a special case of a more general class of estimators that remain consistent provided that a set of at least k† out of K candidate instrumental variables are valid, for k† ≤ K set by the analyst ex ante, without necessarily knowing which IVs are invalid. We provide formal semiparametric theory supporting our results, and characterize the semiparametric efficiency bound for the exposure causal effect, which cannot be improved upon by any regular estimator with our favorable robustness property. We conduct extensive simulation studies and apply our methods to a large-scale analysis of UK Biobank data, demonstrating the superior empirical performance of MR2 compared to competing MR methods.
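To see why a single violation of the exclusion restriction matters when several markers are combined, consider the noise-free sketch below. It does not implement the MR2 estimators; it uses ordinary per-variant ratio estimates, an unweighted inverse-variance-weighted (IVW) style estimate, and the median of the ratios. All numbers and variable names are hypothetical.

```python
import statistics

true_effect = 0.5  # hypothetical causal effect of the exposure on the outcome

# Hypothetical SNP-exposure associations for five candidate instruments.
snp_exposure = [0.10, 0.20, 0.15, 0.25, 0.30]

# Direct SNP-outcome effects that bypass the exposure; only the fourth SNP is invalid.
direct_effect = [0.0, 0.0, 0.0, 0.05, 0.0]

# SNP-outcome associations implied by the model (no sampling noise).
snp_outcome = [true_effect * bx + a for bx, a in zip(snp_exposure, direct_effect)]

# Per-variant ratio estimates: valid IVs recover the true effect, the invalid one does not.
ratio_estimates = [by / bx for bx, by in zip(snp_exposure, snp_outcome)]

# Unweighted IVW-style estimate: regression of SNP-outcome on SNP-exposure through the origin.
ivw_estimate = sum(bx * by for bx, by in zip(snp_exposure, snp_outcome)) / sum(
    bx * bx for bx in snp_exposure
)

# The median ratio is robust here only because a majority of the IVs are valid.
median_estimate = statistics.median(ratio_estimates)

print(ratio_estimates, ivw_estimate, median_estimate)
```

The pooled estimate is pulled away from the true effect by the one invalid instrument, while the median is unaffected in this toy setting. The median relies on most instruments being valid, which is a stronger requirement than the "at least one valid IV" condition stated in the abstract.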
Sanderson, E.; Rosoff, D.; Palmer, T.; Tilling, K.; Davey Smith, G.; Hemani, G.
Mendelian randomization (MR) uses genetic variants to estimate the causal effect of an exposure on an outcome in the presence of unmeasured confounding. A key assumption of MR is that the genetic variants used influence the outcome only through the exposure. Violation of this assumption undermines the gene-environment equivalence principle, which posits that modifying the exposure via genetic variation is equivalent to modifying it through environmental factors. With increasing sample sizes in genome-wide association studies, genetic instruments with smaller effect sizes are being identified as associated with a trait. Through simulation studies, we demonstrate that such variants may have greater liability to act through confounders of the exposure and outcome in an MR study, biasing effect estimates. This bias acts in the same direction as the confounded associations observed in linear regression, but often with greater magnitude, and it acts in the same direction across all of the most commonly used MR estimation methods, potentially leading to misplaced confidence in the results. We further show that the magnitude of bias escalates as the proportion of genetic instruments associated with confounders increases. Importantly, when the potential heritable confounders through which the genetic variants act are known and can be instrumented, unbiased causal estimates can be obtained through pre-estimation filtering or by employing multivariable MR and adjusting for the confounder. We illustrate our approach through an application to estimate the effect of C-reactive protein on type 2 diabetes, using a hypothesis-free approach to identify and remove the effect of potential heritable confounders.
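The two remedies named above, pre-estimation filtering and multivariable MR, can be sketched on noise-free summary statistics. Everything here is an illustrative assumption rather than the authors' simulation: the effect sizes, the variable names, and the use of unweighted regression through the origin (equal standard errors assumed for all variants).

```python
def ivw_univariable(bx, by):
    """Unweighted IVW-style slope: regress SNP-outcome on SNP-exposure through the origin."""
    return sum(x * y for x, y in zip(bx, by)) / sum(x * x for x in bx)


def mvmr_two_traits(bx, bu, by):
    """Two-regressor least squares through the origin, solved via the 2x2 normal equations."""
    sxx = sum(x * x for x in bx)
    suu = sum(u * u for u in bu)
    sxu = sum(x * u for x, u in zip(bx, bu))
    sxy = sum(x * y for x, y in zip(bx, by))
    suy = sum(u * y for u, y in zip(bu, by))
    det = sxx * suu - sxu * sxu
    effect_exposure = (sxy * suu - sxu * suy) / det
    effect_confounder = (sxx * suy - sxu * sxy) / det
    return effect_exposure, effect_confounder


# Hypothetical structural effects.
theta = 0.3   # exposure -> outcome (the target)
gamma = 0.4   # confounder -> outcome
delta = 0.5   # confounder -> exposure

# Hypothetical direct SNP effects on the exposure and on the heritable confounder.
snp_direct_exposure = [0.20, 0.15, 0.10, 0.00, 0.00, 0.05]
snp_confounder = [0.00, 0.00, 0.05, 0.20, 0.15, 0.10]

# Implied total SNP-exposure and SNP-outcome associations (no sampling noise).
snp_exposure = [a + delta * c for a, c in zip(snp_direct_exposure, snp_confounder)]
snp_outcome = [theta * x + gamma * c for x, c in zip(snp_exposure, snp_confounder)]

# Naive univariable MR using every SNP: biased in the direction of the confounding.
naive_estimate = ivw_univariable(snp_exposure, snp_outcome)

# Pre-estimation filtering: keep only SNPs with no association with the confounder.
kept = [(x, y) for x, c, y in zip(snp_exposure, snp_confounder, snp_outcome) if c == 0]
filtered_estimate = ivw_univariable([x for x, _ in kept], [y for _, y in kept])

# Multivariable MR: adjust for the SNP-confounder associations.
mvmr_exposure, mvmr_confounder = mvmr_two_traits(snp_exposure, snp_confounder, snp_outcome)

print(naive_estimate, filtered_estimate, mvmr_exposure, mvmr_confounder)
```

With no sampling noise, the naive estimate overshoots `theta`, while both filtering and the multivariable fit recover it. Real summary statistics carry estimation error, so variant-specific weights and weak-instrument checks would be needed in practice.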
Pott, J.; Palma, M.; Liu, Y.; Mack, J. A.; Sovio, U.; Smith, G. C. S.; Barrett, J.; Burgess, S.
Background and aim: Mendelian randomization (MR) is a widely used tool to estimate causal effects using genetic variants as instrumental variables. To analyse time-varying effects, MR has so far been limited to cross-sectional summary statistics from different samples and time points. We aimed to use longitudinal summary statistics for an exposure in a multivariable MR setting and to validate the effect estimates for the mean, slope and within-individual variability. Simulation study: We tested our approach in twelve scenarios for power and type I error, depending on shared instruments between the mean, slope and variability, and on regression model specifications. We observed high power to detect causal effects of the mean and slope throughout the simulation, but power to detect the variability effect was low when SNPs were shared between the mean and variability. Mis-specified regression models led to lower power and increased type I error. Real data application: We applied our approach to two real data sets (POPS, UK Biobank). We detected significant causal estimates for both the mean and the slope in both cases, but no independent effect of the variability. However, we only had weak instruments in both data sets. Conclusion: We used a new approach to test a time-varying exposure for causal effects of the exposure's mean, slope and variability. The simulation with strong instruments seems promising but also highlights three crucial points: 1) the difficulty of defining the correct exposure regression model, 2) the dependency on the genetic correlation, and 3) the lack of strong instruments in real data. Taken together, this demands a cautious evaluation of the results, accounting for known biology and the trajectory of the exposure.